The genome sequence of Ramsons hoverfly, Portevinia maculata (Fallén, 1817)

We present a genome assembly from an individual Portevinia maculata (Ramsons hoverfly; Arthropoda; Insecta; Diptera; Syrphidae). The genome sequence is 1,125.3 megabases in span. Most of the assembly is scaffolded into 6 chromosomal pseudomolecules, including the X sex chromosome. The mitochondrial genome has also been assembled and is 18.98 kilobases in length. Gene annotation of this assembly on Ensembl identified 24,849 protein coding genes.


Background
Portevinia maculata (Fallén, 1817), also known as the Ramsons hoverfly, is a Northern and Central European hoverfly species. It is widespread but localised in the UK (Ball & Morris, 2015; van Veen, 2014). Individuals occur in woodlands and mature hedgerows, specifically associated with areas rich in Ramsons, Allium ursinum (Ball & Morris, 2015). The larvae develop within the host plant bulbs and stem bases, hence their common name, the Ramsons hoverfly. It is a medium-sized, dark hoverfly with silver-grey, square-shaped dust-spots on the 2nd and 3rd tergites (Ball & Morris, 2015). The species additionally displays characteristic facial features including a rounded facial knob and bright orange antennae (Stubbs & Falk, 2002). This set of distinctive morphological features, along with their habitual presence in the proximity of Ramsons plants, makes the species relatively straightforward to identify (Ball & Morris, 2015).
Larvae are difficult to detect in bulbs until January to March, when they are actively growing and consequently have the capacity to cause notable damage and discolouration to bulbs (Rotheray, 1993; Stubbs & Falk, 2002). The adult flight period runs from April to July, peaking in numbers from mid-May to early June, which corresponds with the period when Ramsons are blooming (Stubbs & Falk, 2002). The majority of records for the Ramsons hoverfly are of males found basking on sunny Ramsons leaves in woodland, frequently holding their wings in a distinguishing delta position (Ball & Morris, 2015; Stubbs & Falk, 2002). In contrast, females are more elusive and are believed to spend a large proportion of time out of sight, close to the woodland floor amongst low-growing foliage (Ball & Morris, 2000; Ball & Morris, 2015; Stubbs & Falk, 2002). The chromosomally complete genome sequence for Portevinia maculata, generated as part of the collaborative Darwin Tree of Life Project, offers an opportunity to investigate and enhance our knowledge of this behaviourally and phenotypically distinct hoverfly species.

Genome sequence report
The genome was sequenced from one Portevinia maculata (Figure 1) collected from Wytham Woods, Berkshire, United Kingdom (51.78, -1.34). A total of 29-fold coverage in Pacific Biosciences single-molecule HiFi long reads was generated. Primary assembly contigs were scaffolded with chromosome conformation Hi-C data. Manual assembly curation corrected 40 missing joins or mis-joins and removed one haplotypic duplication, reducing the scaffold number by 32.61% and increasing the scaffold N50 by 76.35%. The final assembly has a total length of 1,125.3 Mb in 61 sequence scaffolds with a scaffold N50 of 310.9 Mb (Table 1). The snailplot in Figure 2 provides a summary of the assembly statistics, while the distribution of assembly scaffolds on GC proportion and coverage is shown in Figure 3. The cumulative assembly plot in Figure 4 shows curves for subsets of scaffolds assigned to different phyla. Most (99.04%) of the assembly sequence was assigned to 6 chromosomal-level scaffolds, representing 5 autosomes and the X sex chromosome. Chromosome-scale scaffolds confirmed by the Hi-C data are named in order of size (Figure 5; Table 2). The scaffolds making up Chromosome 5 have half coverage in the read mapping data. This may indicate that they are the Y chromosome. While not fully phased, the assembly deposited is of one haplotype. Contigs corresponding to the second haplotype have also been deposited. The mitochondrial genome was also assembled and can be found as a contig within the multifasta file of the genome submission.
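To make the scaffold N50 statistic quoted above concrete, the following is a minimal sketch of how an Nx value can be computed from a list of scaffold lengths. The function name `nx` and the example lengths are hypothetical and are not taken from the idPorMacu1.1 assembly; the sketch illustrates the standard definition only and is not the software used to produce Table 1.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx scaffold length for a list of scaffold lengths.

    Nx is the length L such that scaffolds of length >= L together
    contain at least `fraction` of the total assembly span
    (fraction=0.5 gives N50, fraction=0.9 gives N90).
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    threshold = fraction * sum(ordered)      # span that must be covered
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]  # only reached through floating-point rounding


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20]
example_n50 = nx(example_lengths)
example_n90 = nx(example_lengths, fraction=0.9)
```

In this toy example the total span is 290, so the N50 threshold of 145 is first reached when the second-longest scaffold (70) is added. Joining scaffolds during curation makes the longest sequences longer, which is why curation can reduce the scaffold count and raise the N50 at the same time.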

Sample acquisition and nucleic acid extraction
A specimen of Portevinia maculata (specimen ID Ox001377, ToLID idPorMacu1) was netted in Wytham Woods, Oxfordshire, UK (latitude 51.78, longitude -1.34) on 2021-05-27. The specimen was collected and identified by Liam Crowley (University of Oxford) and preserved on dry ice.
The workflow for high molecular weight (HMW) DNA extraction at the Wellcome Sanger Institute (WSI) includes a sequence of core procedures: sample preparation, sample homogenisation, DNA extraction, fragmentation, and clean-up.
In sample preparation, the idPorMacu1 sample was weighed and dissected on dry ice (Jay et al., 2023). Tissue from the thorax was homogenised using a PowerMasher II tissue disruptor (Denton et al., 2023a). HMW DNA was extracted using the Automated MagAttract v1 protocol (Oatley et al., 2023). HMW DNA was sheared into an average fragment size of 12-20 kb in a Megaruptor 3 system with speed setting 30 (Todorovic et al., 2023). Sheared DNA was purified by solid-phase reversible immobilisation (Strickland et al., 2023): in brief, the method employs a 1.8X ratio of AMPure PB beads to sample to eliminate shorter fragments and concentrate the DNA. The concentration of the sheared and purified DNA was assessed using a Nanodrop spectrophotometer and a Qubit Fluorometer with the Qubit dsDNA High Sensitivity Assay kit.
Fragment size distribution was evaluated by running the sample on the FemtoPulse system.
Protocols developed by the Wellcome Sanger Institute (WSI) Tree of Life core laboratory are available on protocols.io (Denton et al., 2023b).

Sequencing
Pacific Biosciences HiFi circular consensus DNA sequencing libraries were constructed according to the manufacturers' instructions. DNA sequencing was performed by the Scientific [...] (Bernt et al., 2013) and uses these annotations to select the final mitochondrial contig and to ensure the general quality of the sequence.
Table 3 contains a list of relevant software tool versions and sources.

Genome annotation
The BRAKER2 pipeline (Brůna et al., 2021) was used in the default protein mode to generate annotation for the Portevinia maculata assembly (GCA_949715645.1) in Ensembl Rapid Release.

Wellcome Sanger Institute – Legal and Governance
The materials that have contributed to this genome note have been supplied by a Darwin Tree of Life Partner. The submission of materials by a Darwin Tree of Life Partner is subject to the 'Darwin Tree of Life Project Sampling Code of Practice', which can be found in full on the Darwin Tree of Life website. By agreeing with and signing up to the Sampling Code of Practice, the Darwin Tree of Life Partner agrees they will meet the legal and ethical requirements and standards set out within this document in respect of all samples acquired for, and supplied to, the Darwin Tree of Life Project.
Further, the Wellcome Sanger Institute employs a process whereby due diligence is carried out proportionate to the nature of the materials themselves, and the circumstances under which they have been/are to be collected and provided for use. The purpose of this is to address and mitigate any potential legal and/or ethical implications of receipt and use of the materials as part of the research project, and to ensure that in doing so we align with best practice wherever possible. The overarching areas of consideration are:
• Ethical review of provenance and sourcing of the material

Bhagya Thimmappa
USDA Agricultural Research Service, Chatsworth, New Jersey, USA
In the work "The genome sequence of Ramsons hoverfly, Portevinia maculata," the authors reported nuclear genome assembly, annotation, and mitochondrial genome assembly for the Ramsons hoverfly. Ramsons hoverfly is localized to the UK and reported to damage host plant bulbs. Reported assembly and annotation resources are trivial in understanding these flies further. Overall, the manuscript is well written.

Minor comments:
1. Is the sequenced fly male or female?
2. It would be great to mention briefly how gene modules are predicted. Was RNA-Seq data used to support predicted gene modules? This will help the reader decide how good the gene prediction is.
3. Also, does the annotation have alternative splicing information?
4. It appears the BUSCO value mentioned is for the genome assembly. Please clarify this in the table and provide the BUSCO value for structural annotation so we have some idea of how good the structural annotation is.

Are sufficient details of methods and materials provided to allow replication by others? Partly
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Genomics, Bioinformatics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
The introduction provides good background on P. maculata but would benefit from more detail on the importance of this study. A clear research objective sentence would help frame the study.

Overall, the manuscript presents a valuable contribution to the field of genomics with a high-quality genome assembly and annotation of Portevinia maculata. Addressing the points mentioned above would strengthen the manuscript and enhance its impact.

Is the rationale for creating the dataset(s) clearly described? Partly
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Single cell omics, Genomics, Bioinformatics, Sequencing, Developmental Neurobiology
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Nikoletta Andrea Nagy
University of Debrecen, Debrecen, Hungary
The manuscript is well written, and the genome presented is of high quality and can contribute to future research on hoverflies.

Minor comments:
I would recommend referring to Figure 1 in the description of the species in the Background section.
Reviewer Expertise: Evolutionary ecology, entomology. Particular expertise in the Lepidoptera (butterflies and moths).
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Figure 2. Genome assembly of Portevinia maculata, idPorMacu1.1: metrics. The BlobToolKit Snailplot shows N50 metrics and BUSCO gene completeness. The main plot is divided into 1,000 size-ordered bins around the circumference with each bin representing 0.1% of the 1,125,327,955 bp assembly. The distribution of scaffold lengths is shown in dark grey with the plot radius scaled to the longest scaffold present in the assembly (406,294,527 bp, shown in red). Orange and pale-orange arcs show the N50 and N90 scaffold lengths (310,928,733 and 139,547,270 bp), respectively. The pale grey spiral shows the cumulative scaffold count on a log scale with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated and missing BUSCO genes in the diptera_odb10 set is shown in the top right. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/idPorMacu1_1/dataset/idPorMacu1_1/snail.
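The size-ordered binning described in the Figure 2 caption can be approximated with a short sketch. This is an illustration under stated assumptions rather than BlobToolKit's own implementation: the function name `size_ordered_bins` is hypothetical, the example lengths are invented, and each bin is simply assigned the length of the scaffold that covers the end of that bin's share of the cumulative assembly span.

```python
def size_ordered_bins(lengths, n_bins=1000):
    """Assign a scaffold length to each of n_bins equal slices of the span.

    Scaffolds are ordered longest first. Bin b (1-based) covers the
    cumulative span up to b/n_bins of the total and is given the length
    of the scaffold that contains that cumulative position.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    bins = []
    index = 0
    cumulative = ordered[0]
    for b in range(1, n_bins + 1):
        # Integer comparison avoids floating-point error:
        # cumulative < total * b / n_bins
        while cumulative * n_bins < total * b and index < len(ordered) - 1:
            index += 1
            cumulative += ordered[index]
        bins.append(ordered[index])
    return bins


# Hypothetical scaffold lengths and a coarse 10-bin layout for readability.
toy_bins = size_ordered_bins([50, 30, 20], n_bins=10)
toy_n50 = toy_bins[len(toy_bins) // 2 - 1]  # bin ending at 50% of the span
```

With this layout, the bin that ends at 50% of the span holds the N50 scaffold length and the bin that ends at 90% holds the N90 length, which is the relationship the orange and pale-orange arcs in the snail plot convey.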

Figure 4. Genome assembly of Portevinia maculata, idPorMacu1.1: BlobToolKit cumulative sequence plot. The grey line shows cumulative length for all scaffolds. Coloured lines show cumulative lengths of scaffolds assigned to each phylum using the buscogenes taxrule. An interactive version of this figure is available at https://blobtoolkit.genomehubs.org/view/idPorMacu1_1/dataset/idPorMacu1_1/cumulative.

Figure 5. Genome assembly of Portevinia maculata, idPorMacu1.1: Hi-C contact map of the idPorMacu1.1 assembly, visualised using HiGlass. Chromosomes are shown in order of size from left to right and top to bottom. An interactive version of this figure may be viewed at https://genome-note-higlass.tol.sanger.ac.uk/l/?d=Otl7w1WoTVCakMcMc11ytQ.

Reviewer Report 18 September 2024
https://doi.org/10.21956/wellcomeopenres.22853.r98669
© 2024 Andrea Nagy N. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

○ Include the sex of the specimen used for DNA extraction.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Insect genomics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Reviewer Report 17 September 2024
https://doi.org/10.21956/wellcomeopenres.22853.r98671
© 2024 Koblmüller S. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Stephan Koblmüller
University of Graz, Graz, Austria
In this data note, Crowley et al. present the genome assembly of Ramson's hoverfly, a hoverfly species occurring in Northern and Central Europe. The background section gives a concise, but nonetheless informative, introduction to the study species. The data are clear, and the methods are well described. I have no objections to any of the data presented or the results. As far as I can tell, the datasets are deposited in publicly accessible repositories.
Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: phylogenetics-/genomics, population genetics/genomics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Juan Wulff
North Carolina State University, Raleigh, USA
The manuscript is well written, concise but clear enough, following the style of a short communication or a brief report. The background provided is sufficient to understand the model under study, and the methodology used was a combination of deep sequencing tools to obtain a high-quality genome assembly of Portevinia maculata. I recommend that the authors improve the gene annotation in future studies.
Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Entomology; Molecular Biology; Genetics
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Is the rationale for creating the dataset(s) clearly described? Yes
Are the protocols appropriate and is the work technically sound? Yes
Are sufficient details of methods and materials provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes
Competing Interests: No competing interests were disclosed.